MonteverdiV1, Main, Exploration, bibRecord, 000043

Clustering Very Large Dissimilarity Data Sets

Internal identifier: 000043 (Main/Exploration); previous: 000042; next: 000044

Authors: Barbara Hammer [Germany]; Alexander Hasenfuss [Germany]

Source:

Lecture Notes in Computer Science [ISSN 0302-9743]; 2010.

RBID : ISTEX:AF55A8B5A61638C92019758BF58EAABE61120846

Abstract

Clustering and visualization are key issues in computer-supported data inspection, and a variety of promising tools exist for such tasks, such as the self-organizing map (SOM) and variations thereof. Real-life data, however, pose severe problems for standard data inspection. On the one hand, data are often represented by complex non-vectorial objects, so standard methods for finite-dimensional vectors in Euclidean space cannot be applied. On the other hand, very large data sets have to be dealt with: the data neither fit into main memory nor allow more than one affordable pass, i.e. standard methods simply cannot be applied due to the sheer amount of data. We present two recent extensions of topographic mappings: relational clustering, which can deal with general proximity data given by pairwise distances, and patch processing, which can process streaming data of arbitrary size in patches. Together, they yield an efficient linear-time data inspection method for general dissimilarity data structures. We present the theoretical background as well as applications to the areas of text and multimedia processing based on the generalized compression distance.
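
To make the notion of "proximity data given by pairwise distances" concrete, here is a minimal sketch of a compression-based dissimilarity. It is an illustration only, not the authors' implementation: it uses the normalized compression distance (NCD) with `zlib` as the compressor, which is an assumption standing in for the generalized compression distance named in the abstract. The function names `compressed_size`, `ncd`, and `dissimilarity_matrix` are hypothetical.

```python
import zlib
from typing import List, Sequence


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumption: zlib approximates the ideal compressor; values are only
    approximately in [0, 1] and are not guaranteed to be symmetric.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def dissimilarity_matrix(items: Sequence[bytes]) -> List[List[float]]:
    """All pairwise NCD values, i.e. one row and one column per item."""
    return [[ncd(a, b) for b in items] for a in items]


if __name__ == "__main__":
    texts = [
        b"the quick brown fox jumps over the lazy dog " * 20,
        b"the quick brown fox jumps over the lazy cat " * 20,
        bytes(range(256)) * 4,
    ]
    for row in dissimilarity_matrix(texts):
        print(["%.2f" % value for value in row])
```

Building the full matrix needs a number of compressions that grows quadratically with the number of items, which is the kind of cost that becomes prohibitive for very large data sets.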

Url:

https://api.istex.fr/document/AF55A8B5A61638C92019758BF58EAABE61120846/fulltext/pdf

DOI: 10.1007/978-3-642-12159-3_24

Affiliations:

Germany

Links to previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000309
to stream Istex, to step Curation: 000257
to stream Istex, to step Checkpoint: 000022
to stream Main, to step Merge: 000056
to stream Main, to step Curation: 000056

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Clustering Very Large Dissimilarity Data Sets</title>
<author><name sortKey="Hammer, Barbara" sort="Hammer, Barbara" uniqKey="Hammer B" first="Barbara" last="Hammer">Barbara Hammer</name>
</author>
<author><name sortKey="Hasenfuss, Alexander" sort="Hasenfuss, Alexander" uniqKey="Hasenfuss A" first="Alexander" last="Hasenfuss">Alexander Hasenfuss</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:AF55A8B5A61638C92019758BF58EAABE61120846</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-12159-3_24</idno>
<idno type="url">https://api.istex.fr/document/AF55A8B5A61638C92019758BF58EAABE61120846/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000309</idno>
<idno type="wicri:Area/Istex/Curation">000257</idno>
<idno type="wicri:Area/Istex/Checkpoint">000022</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Hammer B:clustering:very:large</idno>
<idno type="wicri:Area/Main/Merge">000056</idno>
<idno type="wicri:Area/Main/Curation">000056</idno>
<idno type="wicri:Area/Main/Exploration">000043</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Clustering Very Large Dissimilarity Data Sets</title>
<author><name sortKey="Hammer, Barbara" sort="Hammer, Barbara" uniqKey="Hammer B" first="Barbara" last="Hammer">Barbara Hammer</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>CITEC, University of Bielefeld</wicri:regionArea>
<wicri:noRegion>University of Bielefeld</wicri:noRegion>
<wicri:noRegion>University of Bielefeld</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Hasenfuss, Alexander" sort="Hasenfuss, Alexander" uniqKey="Hasenfuss A" first="Alexander" last="Hasenfuss">Alexander Hasenfuss</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Clausthal University of Technology</wicri:regionArea>
<wicri:noRegion>Clausthal University of Technology</wicri:noRegion>
<wicri:noRegion>Clausthal University of Technology</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">AF55A8B5A61638C92019758BF58EAABE61120846</idno>
<idno type="DOI">10.1007/978-3-642-12159-3_24</idno>
<idno type="ChapterID">Chap24</idno>
<idno type="ChapterID">24</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Clustering and visualization constitute key issues in computer-supported data inspection, and a variety of promising tools exist for such tasks such as the self-organizing map (SOM) and variations thereof. Real life data, however, pose severe problems to standard data inspection: on the one hand, data are often represented by complex non-vectorial objects and standard methods for finite dimensional vectors in Euclidean space cannot be applied. On the other hand, very large data sets have to be dealt with, such that data do neither fit into main memory, nor more than one pass over the data is still affordable, i.e. standard methods can simply not be applied due to the sheer amount of data. We present two recent extensions of topographic mappings: relational clustering, which can deal with general proximity data given by pairwise distances, and patch processing, which can process streaming data of arbitrary size in patches. Together, an efficient linear time data inspection method for general dissimilarity data structures results. We present the theoretical background as well as applications to the areas of text and multimedia processing based on the generalized compression distance.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Hammer, Barbara" sort="Hammer, Barbara" uniqKey="Hammer B" first="Barbara" last="Hammer">Barbara Hammer</name>
</noRegion>
<name sortKey="Hasenfuss, Alexander" sort="Hasenfuss, Alexander" uniqKey="Hasenfuss A" first="Alexander" last="Hasenfuss">Alexander Hasenfuss</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/MonteverdiV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000043 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000043 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Musique
   |area=    MonteverdiV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:AF55A8B5A61638C92019758BF58EAABE61120846
   |texte=   Clustering Very Large Dissimilarity Data Sets
}}

This area was generated with Dilib version V0.6.21.
Data generation: Mon May 9 21:59:15 2016. Site generation: Mon Feb 12 09:57:54 2024

Exploration server on Monteverdi
Warning: this site is under development. It was generated by automated means from raw corpora, so the information has not been validated.
